🎯 What We'll Cover
This is the one place in the entire course where we recommend a paid tool, so it gets the same calibrated treatment the course gives every other claim: an honest reckoning, not a sales pitch. This lesson covers what Claude Code costs, the equity problem that cost creates, what you can genuinely approximate on free tools, and — the part that changes how you work — the disposition shift from chatting to managing an agent.
The aim is that by the end you can decide, for yourself and honestly, whether this track is worth it for your situation. For some readers it clearly will be; for others it clearly will not be, and saying so plainly is part of keeping faith with the rest of the course.
💰 The Cost Picture, Honestly
Claude Code is not free. As of mid-2026 it is available through Anthropic's paid subscription tiers and via metered API usage; the entry subscription sits at roughly the price of a couple of streaming services per month, and the heavier tiers cost substantially more. Because these numbers change, this lesson deliberately does not pin an exact figure — check the current Anthropic pricing page rather than trusting a number a course page wrote months ago. (That habit — trust the live source, not the cached claim — is the Week 9 disposition applied to pricing.)
There are two cost shapes worth distinguishing. A flat subscription gives you a generous but bounded amount of use for a predictable monthly fee — the right model for most individual researchers. Metered API use is pay-as-you-go and can run up quickly when an agent works autonomously for an hour across a large codebase: that is real money per run, and worth watching. Either way, the honest framing from Week 3 applies — the cost is not only the invoice. It is also tokens, time, attention, and the review effort every piece of agent output demands.
The practical question this raises is how to not be surprised by it. Two honest cautions. On a subscription your usage is generous but not unlimited — sustained agentic work can reach the cap, at which point you wait for a reset, so budget for that if a deadline is near. On metered API use the spend concentrates in long autonomous runs: an agent working unattended for an hour across a large project is where a bill grows, far more than a quick exchange. Helpfully, the instincts that keep you safe also keep you cheap — a tightly-scoped task in plan mode costs less than “go work out my whole project,” and the pre-registration gates of Lesson B.2 stop you pouring compute into a question that has already failed its test.
💰 The bill is not only the tokens
Remember the burden named in the disposition shift below: every agentic result needs checking, and that review time is part of the true cost. A run that is cheap in tokens but produces an hour of plausible-but-wrong work you then have to untangle was not cheap. Scope tightly, checkpoint often, and the spend — of money and of attention — stays proportionate to the stakes.
🌍 The Equity Tension, Named
This course has been militantly free-tier-first, and that was a deliberate choice with an argument behind it. Week 10 made the case that “just pay for the Pro plan” is not advice but an assumption that excludes most of the people the course is for. Week 11.4 put numbers on the African compute gap. A paid-tool track sits in direct tension with all of that, and the worst thing we could do is pretend otherwise.
So, plainly: for a researcher in a well-funded lab, the subscription cost is a rounding error and this track is a straightforward yes. For a postgraduate paying out of pocket, in a department without an AI budget, on a currency that makes dollar-denominated subscriptions painful, it is a genuine barrier — the same barrier this course spent eleven weeks taking seriously. This track is the signposted exception, not a quiet reversal. It comes after the assessed core precisely so that nothing required of you depends on being able to afford it.
🎥 What you can approximate for free
You can get part of the way without paying. The free Claude.ai web app, combined with manual file discipline — you keep a real project folder, you paste files in, you save the outputs and decisions back into that folder yourself — gives you the most important idea of this whole track: the chat is not the archive. You can keep a decision log, a data inventory, and separated outputs entirely by hand.
What you lose without the paid agent is the automation: it will not read and write your files for you, run your code, drive Git, or use Skills and subagents. You become the harness — doing by hand the file operations the agent would otherwise do. That is slower and more error-prone, but it is not nothing, and the reproducibility discipline in Lesson B is valuable whether a paid agent enforces it or you do.
🧰 The Disposition Shift: You Are Managing an Agent Now
The hardest part of using Claude Code well is not technical. It is a shift in posture. In a chat you prompt and read, prompt and read — you are in the loop on every step. With an agent you delegate: you hand over a multi-step task and the agent goes away and does it. The skill is no longer prompt-craft. It is judgement — about what to delegate, where to put checkpoints, and how to verify what comes back.
Week 11.1 borrowed Ethan Mollick's framing for exactly this moment: we are moving from working with a co-intelligence to managing what he calls a wizard — a system that produces sophisticated work through a process you did not watch and cannot fully see. His paradox holds here with force: competence and opacity rise together (Mollick, “On Working with Wizards,” 2025). The better the agent gets, the more it can do unsupervised, and the harder it becomes to verify. And Week 10's Princeton reliability finding (Rabanser et al., 2026) is the warning underneath: an agent that runs for an hour can do an hour of confident, plausible, wrong work. The verification burden does not shrink as the tool improves. It grows.
👤 From the instructor's own practice
“There’s a careful balance needed between micromanaging and making sure that there are clear boundaries and checkpoints. You don’t want to let it loose in an unstructured way, and inherently there are dangers in anything agentic, but there needs to be some balance if it is going to be useful. It took a while to figure out the right oversight to give it, but now I treat it like a very capable, overenthusiastic graduate student who I have to keep a careful eye on at each checkpoint that I set.”
“Setting checkpoints is key, but it’s not enough. Because the volume of output is now so large, we are going to have to figure out how to both be effective and careful.”
There is one more thing Week 11.1 said that lands hardest here: two researchers using the identical tool can have wildly different experiences. The agent does not make you a good researcher. It amplifies whatever practice you bring to it. A careful researcher with strong verification habits gets a powerful collaborator; a careless one gets a faster way to produce confident nonsense. The rest of this track is, in effect, about being the first kind.
🧠 The Human Core: What Not to Automate
A workflow this automated has a failure mode that matters more than any bug: it can let you produce a paper without doing the research. Everything in Lesson B is worth using precisely because it clears away drudgery — but the drudgery was never the research. The research is the thinking, and the thinking has to be yours. So before the structure arrives, here is the counterweight to the whole track: the work you must never hand over, however capable the agent becomes.
🤖 A co-scientist, not a substitute scientist
The agent is for the work that is mechanical (fetching, extracting, formatting), repetitive (checking fifty claims or two hundred citations), and adversarial (an honest critic that will not flatter you). It is not for the work that is generative, interpretive, or authorial. The moment you let it do your thinking, reading, or writing, you have stopped doing research.
The idea is yours. Taste — noticing what is strange, what is beautiful, what is worth a year of your life — is the whole game, and it comes from immersion, conversation, teaching, and play, not from a prompt. Do not ask the agent for your research questions. Bring an idea you already care about and use the agent to stress-test it: surface the load-bearing assumption, find the paper that already did it, ask the awkward question.
Reading is yours. Turning a PDF into Markdown is not reading it. The argument you have with a paper, the marginal note, the slow accretion of a mental map of a field, the connection that fires only in your own head — that is where research understanding is actually built. An agent can put a paper in front of you in a workable form; it cannot do the reading. Read deeply, and read more than the agent summarises.
Writing is thinking — take this one most seriously of all. Writing is not the transcription of finished thoughts; it is how the thoughts get finished. Putting an argument into sentences is what exposes the gap in the logic, forces the definition you were fudging, and tells you what you actually believe. If the agent writes your draft, you skip exactly the thinking the writing was meant to do — and you end up defending prose you never reasoned your way to. So: you write the draft. The agent’s job around your writing is to critique it, check its claims, and catch inconsistencies — not to produce it, and not to supply your voice.
Judgement is yours. What a surprising result means, whether a finding is interesting or merely true, which thread to pull next — these are judgements, and judgement does not come off a shelf. The agent can lay out the options and the evidence; the choosing is yours.
The struggle is not a bug. Some of the friction in research — the stuck week, the fourth rewrite, the confusion that sits just before understanding — is not waste to be optimised away. It is often exactly where the understanding is forged. Automate the drudgery, by all means; be wary of automating away the productive struggle along with it.
🧪 The acid test
At the end, can you defend every idea, every claim, and every sentence as your own thinking? If yes, the agent helped you do research. If the agent did the thinking, you have a paper but you have not done research — and your examiners, your reviewers, and your future self will eventually find the hollow centre.
This section is drawn from the instructor’s companion guide, Claude Code as a Co-Scientist, available to download at the end of Lesson A.1. Keep it in view as you read Lesson B: the structure there exists to support this thinking, never to replace it.
⚖️ Discipline Proportionate to Stakes
Before Lesson B lays out a fair amount of structure — immutable raw data, decision logs, reproducible folders — it is worth heading off a misreading. None of this is a ritual you must perform on every interaction. The honest practice is to match the discipline to the stakes and the lifespan of the work.
👤 From the instructor's own practice
“I’ve got many different projects. Some need serious scaffolding — to make sure everything is reproducible and auditable, and that I understand all of the underlying processes. Others need much less, but still have agentic aspects. An example of the latter is a project for my fitness goals, which reads in my smartwatch data and adjusts the training plan accordingly: the stakes there are low, and I’ll question it if it feels really off.”
“But something that’s going to be published has much higher stakes — and so it needs much more scaffolding.”
Read Lesson B in that spirit. It shows you the full apparatus so you know what good looks like when the stakes are high. You then apply as much of it as the task in front of you deserves — which for an exploratory afternoon might be almost none, and for the analysis behind a paper figure should be most of it.
📝 The disposition, in three lines
Delegate, then verify — the burden of checking is yours and it grows with the agent's competence.
The tool amplifies your practice; it does not replace your judgement.
Scaffold in proportion to the stakes — heavy where the work must last, light where it need not.
Coming up in A.3: your first real session. We open a deliberately messy research archive, use plan mode to inspect it without changing anything, watch how permissions work, and meet the single most important file in any Claude Code project — CLAUDE.md, the control surface.